Queueing with redundant requests: exact analysis
Abstract
Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a request on multiple servers and wait for the first completion, discarding all remaining copies of the request. However, no exact analysis of systems with redundancy has been available. This paper presents the first such analysis. We allow for any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution of the state of the system. In small (two- or three-server) systems, we derive simple forms for the response time distribution of both the redundant and non-redundant classes, and we quantify the “gain” to redundant classes and the “pain” to non-redundant classes caused by redundancy. We find some surprising results. First, the response time of a fully redundant class follows a simple exponential distribution, and that of the non-redundant class follows a generalized hyperexponential distribution. Second, fully redundant classes are “immune” to any pain caused by other classes becoming redundant. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and join-the-shortest-queue (JSQ) routing of a class. We find that, in many cases, redundancy outperforms JSQ and Opt-Split with respect to overall response time, making it an attractive solution.
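The core mechanism described in the abstract (run copies on several servers, keep the first completion) can be illustrated with a small simulation. This is only a sketch under assumptions that are not stated in the abstract: service times are taken to be exponential and independent across servers, all copies start at once, and queueing is ignored entirely, so it does not reproduce the paper's analysis. The function names and the rates are hypothetical. Under these assumptions the minimum of independent exponential times with rates mu1 and mu2 is itself exponential with rate mu1 + mu2, which the simulation checks numerically.

```python
import random


def redundant_service_time(rates, rng):
    """Time until the first copy finishes.

    Assumption (not from the abstract): each copy's service time is an
    independent exponential draw with its server's rate, and every copy
    starts immediately (no queueing).
    """
    return min(rng.expovariate(mu) for mu in rates)


def mean_redundant_service_time(rates, samples=100_000, seed=0):
    """Monte Carlo estimate of the mean first-completion time."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    total = sum(redundant_service_time(rates, rng) for _ in range(samples))
    return total / samples


if __name__ == "__main__":
    rates = [1.0, 2.0]  # hypothetical server rates mu1, mu2
    estimate = mean_redundant_service_time(rates)
    closed_form = 1.0 / sum(rates)  # mean of an Exp(mu1 + mu2) variable
    print(f"simulated mean: {estimate:.4f}")
    print(f"1 / (mu1 + mu2): {closed_form:.4f}")
```

The sketch covers only the service-time side of redundancy. The response time results quoted in the abstract also account for queueing and for the load that redundant copies place on other classes, which this example leaves out.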
Similar resources
S&X: Decoupling Server Slowdown (S) and Job Size (X) in Modeling Job Redundancy
Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to replicate a request so that it joins the queue at multiple servers. The request is considered complete as soon as any one copy of the request completes. Redundancy is beneficial because it allows us to overcome server-side variability – the fact that the server we choose might be temporarily...
Stochastic Bandwidth Packing Process: Stability Conditions via Lyapunov Function Technique
We consider the following stochastic bandwidth packing process: the requests for communication bandwidth of different sizes arrive at times t = 0, 1, 2, . . . and are allocated to a communication link using the “largest first” rule. Each request takes a unit time to complete. The unallocated requests form queues. Coffman and Stolyar [6] introduced this system and posed the following question: under...
The MDS Queue: Analysing Latency Performance of Codes and Redundant Requests
In order to scale economically, data centers are increasingly evolving their data storage methods from the use of simple data replication to the use of more powerful erasure codes, which provide the same level of reliability as replication-based methods at a significantly lower storage cost. In particular, it is well known that Maximum-Distance-Separable (MDS) codes, such as Reed-Solomon codes, ...
Performability Modelling of Distributed Systems using Layered Queueing Networks
Proliferation of large and complex fault-tolerant distributed systems in recent years has stimulated the combined modelling of performance and dependability of such systems. For large systems it may be very expensive to compute valid performance estimates to be used in the combined performability measures. This work considers two different classes of fault-tolerant client-server systems, in whi...
Heavy Tails in Queueing Systems: Impact of Parallelism on Tail Performance
In this paper we quantify the efficiency of parallelism in systems that are prone to failures and exhibit power law processing delays. We characterize the performance of two prototype schemes of parallelism, redundant and split, in terms of both the power law exponent and exact asymptotics of the delay distribution tail. We also develop the optimal splitting scheme which ensures that split alwa...
Journal: Queueing Syst.
Volume: 83
Publication date: 2016